Adaptive, fast walking in a biped robot under neuronal control and learning
Human walking is a dynamic, partly self-stabilizing process relying on the interaction of the biomechanical design with its neuronal control. The coordination of this process is a very difficult problem, and it has been suggested that it involves a hierarchy of levels, where the lower ones, e.g., interactions between muscles and the spinal cord, are largely autonomous, and where higher-level control (e.g., cortical) arises only pointwise, as needed. This requires an architecture of several nested sensori-motor loops, in which the walking process provides feedback signals to the walker's sensory systems, which can be used to coordinate its movements. To complicate the situation, at a maximal walking speed of more than four leg lengths per second, the cycle period available to coordinate all these loops is rather short. In this study we present a planar biped robot which uses the design principle of nested loops to combine the self-stabilizing properties of its biomechanical design with several levels of neuronal control. Specifically, we show how to adapt control by including online learning mechanisms based on simulated synaptic plasticity. This robot can walk at high speed (>3.0 leg lengths/s), self-adapting to minor disturbances and reacting robustly to abruptly induced gait changes. At the same time, it can learn to walk on different terrains, requiring only a few learning experiences. This study shows that the tight coupling of physical with neuronal control, guided by sensory feedback from the walking pattern itself and combined with synaptic learning, may be a way forward to better understand and solve coordination problems in other complex motor tasks.
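The idea of a nested sensori-motor loop with online synaptic plasticity can be illustrated in a few lines. The sketch below is a hypothetical single loop, not the robot's actual controller: a fixed ground-contact reflex drives a motor neuron, and the plastic synapse of an earlier-arriving sensory cue grows through a differential Hebbian update, so the motor command gradually anticipates the reflex. All timings, gains, and the trace decay are illustrative assumptions.

```python
def learn_anticipatory_reflex(cycles=10, period=40, eta=0.05):
    """Minimal sketch of one sensori-motor loop with online plasticity.
    An early sensory cue precedes a ground-contact reflex in each gait
    cycle; its synaptic weight w follows a differential Hebbian rule
    (eligibility trace of the cue times the temporal derivative of the
    reflex signal). Timings and gains are illustrative assumptions."""
    w = 0.0            # plastic weight of the anticipatory pathway
    trace = 0.0        # low-pass eligibility trace of the early cue
    prev_reflex = 0.0
    for t in range(cycles * period):
        phase = t % period
        early = 1.0 if phase == 5 else 0.0    # anticipatory cue
        reflex = 1.0 if phase == 10 else 0.0  # ground-contact reflex
        trace = 0.9 * trace + early
        # weight grows when the traced cue coincides with reflex onset
        w += eta * trace * (reflex - prev_reflex)
        prev_reflex = reflex
        motor = w * early + reflex            # learned + reflexive drive
    return w
```

After a few gait cycles the anticipatory weight ends up positive, i.e., the motor command begins before ground contact — the qualitative effect, not a model of the published controller.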
Cluster update and recognition
We present a fast and robust cluster update algorithm that is especially
efficient in implementing the task of image segmentation using the method of
superparamagnetic clustering. We apply it to a Potts model with spin
interactions that are defined by gray-scale differences within the image.
Motivated by biological systems, we introduce the concept of neural inhibition
to the Potts model realization of the segmentation problem. Including the
inhibition term in the Hamiltonian results in enhanced contrast and thereby
significantly improves segmentation quality. As a second benefit we can - after
equilibration - directly identify the image segments as the clusters formed by
the clustering algorithm. To construct a new spin configuration the algorithm
performs the standard steps of (1) forming clusters and of (2) updating the
spins in a cluster simultaneously. As opposed to standard algorithms, however,
we share the interaction energy between the two steps. Thus the update
probabilities are not independent of the interaction energies. As a
consequence, we observe an acceleration of the relaxation by a factor of 10
compared to the Swendsen and Wang procedure. Comment: 4 pages, 2 figures
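The two standard steps named above are compact enough to sketch. The code below implements the conventional Swendsen-Wang-style sweep (not the accelerated shared-energy variant the abstract proposes) on a grid image: bonds freeze between equal-spin neighbours with a probability set by a gray-scale coupling, clusters are collected by union-find, and each cluster receives one new random spin. The coupling form and all constants are illustrative assumptions.

```python
import math
import random

def sw_sweep(gray, spins, q=10, temperature=1.0, theta=10.0, seed=0):
    """One Swendsen-Wang-style cluster update on a 2D image grid.
    Couplings J = exp(-(g_i - g_j)^2 / (2*theta^2)) derive from
    gray-scale differences, as in superparamagnetic clustering; this is
    the standard algorithm, not the shared-energy variant. q,
    temperature, and theta are illustrative values."""
    rng = random.Random(seed)
    h, w = len(gray), len(gray[0])
    parent = list(range(h * w))

    def find(i):                      # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # step (1): form clusters by stochastically freezing bonds
    for r in range(h):
        for c in range(w):
            for dr, dc in ((0, 1), (1, 0)):
                r2, c2 = r + dr, c + dc
                if r2 >= h or c2 >= w:
                    continue
                if spins[r][c] != spins[r2][c2]:
                    continue          # bonds only between equal spins
                j = math.exp(-(gray[r][c] - gray[r2][c2]) ** 2
                             / (2 * theta ** 2))
                p = 1.0 - math.exp(-j / temperature)
                if rng.random() < p:
                    parent[find(r * w + c)] = find(r2 * w + c2)

    # step (2): give every cluster one new random spin
    new_spin = {}
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            root = find(r * w + c)
            if root not in new_spin:
                new_spin[root] = rng.randrange(q)
            out[r][c] = new_spin[root]
    return out
```

At low temperature the frozen bonds trace out homogeneous image regions, so after equilibration the clusters can be read off directly as segments, as the abstract describes.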
Learning to reach by reinforcement learning using a receptive field based function approximation approach with continuous actions
Reinforcement learning methods can be used in robotics applications, especially for specific target-oriented problems, for example the reward-based recalibration of goal-directed actions. To this end, still relatively large and continuous state-action spaces need to be handled efficiently. The goal of this paper is thus to develop a novel, rather simple method which uses reinforcement learning with function approximation in conjunction with different reward strategies for solving such problems. For the testing of our method, we use a four-degree-of-freedom reaching problem in 3D space, simulated by a two-joint robot arm system with two DOF per joint. Function approximation is based on 4D, overlapping kernels (receptive fields), and the state-action space contains about 10,000 of these. Different types of reward structures are compared, for example reward-on-touching-only against reward-on-approach. Furthermore, forbidden joint configurations are punished. A continuous action space is used. In spite of the rather large number of states and the continuous action space, these reward/punishment strategies allow the system to find a good solution usually within about 20 trials. The efficiency of our method demonstrated in this test scenario suggests that it might be possible to use it on a real robot for problems where mixed rewards can be defined, in situations where other types of learning might be difficult.
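The ingredients described above — receptive-field function approximation, a continuous action, and a reward-on-approach signal — can be sketched in a heavily simplified 1D analogue. This is not the paper's 4-DOF setup: the task (a point "arm" reaching a target on a line), the actor-critic form, and all learning rates are assumptions made for illustration.

```python
import math
import random

def train_reach(episodes=300, steps=30, seed=0):
    """Receptive-field function approximation with a continuous action:
    a 1D 'arm' at position s must reach target 0.8 from start 0.2.
    Gaussian kernels give the features; an actor-critic with a
    reward-on-approach signal (negative distance) updates value and
    policy weights from the TD error. Illustrative sizes and rates."""
    rng = random.Random(seed)
    centers = [i / 10 for i in range(11)]   # receptive-field centres
    sigma, gamma, a_v, a_p, target = 0.1, 0.9, 0.2, 0.1, 0.8
    w_v = [0.0] * len(centers)              # critic weights
    w_p = [0.0] * len(centers)              # actor weights

    def phi(s):                             # normalized Gaussian features
        acts = [math.exp(-(s - c) ** 2 / (2 * sigma ** 2)) for c in centers]
        z = sum(acts)
        return [a / z for a in acts]

    def value(f):
        return sum(wi * fi for wi, fi in zip(w_v, f))

    for _ in range(episodes):
        s = 0.2
        for _ in range(steps):
            f = phi(s)
            noise = rng.gauss(0.0, 0.5)                 # exploration
            a = sum(wi * fi for wi, fi in zip(w_p, f)) + noise
            s2 = min(1.0, max(0.0, s + 0.1 * math.tanh(a)))
            r = -abs(s2 - target)                       # reward-on-approach
            delta = r + gamma * value(phi(s2)) - value(f)
            for i, fi in enumerate(f):
                w_v[i] += a_v * delta * fi              # critic update
                w_p[i] += a_p * delta * noise * fi      # policy gradient
            s = s2
    return w_p, phi

# greedy rollout with the learned continuous policy
w_p, phi = train_reach()
s = 0.2
for _ in range(30):
    a = sum(wi * fi for wi, fi in zip(w_p, phi(s)))
    s = min(1.0, max(0.0, s + 0.1 * math.tanh(a)))
```

The shaped (approach-based) reward gives informative TD errors from the first trials, which is the property the abstract credits for the fast (~20-trial) convergence in the full problem.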
Reinforcement learning or active inference?
This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimise their free energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming, namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain.
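The flavour of "behaviour without reward" can be conveyed with a toy agent that is deliberately much simpler than the mountain-car benchmark: no reward, value, or utility appears anywhere; the agent merely holds a prior belief about the state it expects to occupy, and action descends the gradient of the squared prediction error, a crude stand-in for free energy. The dynamics and gains below are illustrative assumptions, not the paper's generative model.

```python
def free_energy_agent(steps=500, dt=0.1, prior=1.0, k=0.5):
    """Toy 'active inference' agent: the agent expects to sense state
    `prior`; action is the negative gradient of a free-energy proxy
    F = (x - prior)^2 / 2 with respect to x, applied as a force on a
    damped point mass. A minimal setpoint illustration under assumed
    linear dynamics -- not the mountain-car solution."""
    x, v = 0.0, 0.0
    for _ in range(steps):
        err = x - prior          # sensory prediction error
        a = -k * err             # action suppresses predicted error
        v += dt * (a - 0.5 * v)  # damped point-mass dynamics
        x += dt * v
    return x
```

The agent settles at its prior expectation, so goal-directed behaviour emerges purely from minimising prediction error — the qualitative point the abstract makes with a far richer model.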
Coverage, Continuity and Visual Cortical Architecture
The primary visual cortex of many mammals contains a continuous
representation of visual space, with a roughly repetitive aperiodic map of
orientation preferences superimposed. It was recently found that orientation
preference maps (OPMs) obey statistical laws which are apparently invariant
among species widely separated in eutherian evolution. Here, we examine whether
one of the most prominent models for the optimization of cortical maps, the
elastic net (EN) model, can reproduce this common design. The EN model
generates representations which optimally trade off stimulus space coverage and
map continuity. While this model has been used in numerous studies, no
analytical results about the precise layout of the predicted OPMs have been
obtained so far. We present a mathematical approach to analytically calculate
the cortical representations predicted by the EN model for the joint mapping of
stimulus position and orientation. We find that in all previously studied
regimes, predicted OPM layouts are perfectly periodic. An unbiased search
through the EN parameter space identifies a novel regime of aperiodic OPMs with
pinwheel densities lower than found in experiments. In an extreme limit,
aperiodic OPMs quantitatively resembling experimental observations emerge.
Stabilization of these layouts results from strong nonlocal interactions rather
than from a coverage-continuity-compromise. Our results demonstrate that
optimization models for stimulus representations dominated by nonlocal
suppressive interactions are in principle capable of correctly predicting the
common OPM design. They call into question whether visual cortical feature
representations can be explained by a coverage-continuity-compromise. Comment: 100 pages, including an Appendix, 21 + 7 figures
Motion processing with wide-field neurons in the retino-tecto-rotundal pathway
The retino-tecto-rotundal pathway is the main visual pathway in non-mammalian vertebrates and has been found to be highly involved in visual processing. Despite the extensive receptive fields of tectal and rotundal wide-field neurons, pattern discrimination tasks suggest a system with high spatial resolution. In this paper, we address the problem of how global processing performed by motion-sensitive wide-field neurons can be brought into agreement with the concept of a local analysis of visual stimuli. As a solution to this problem, we propose a firing-rate model of the retino-tecto-rotundal pathway which describes how spatiotemporal information can be organized and retained by tectal and rotundal wide-field neurons while processing Fourier-based motion in the absence of periodic receptive-field structures. The model incorporates anatomical and electrophysiological experimental data on tectal and rotundal neurons, and the basic response characteristics of tectal and rotundal neurons to moving stimuli are captured by the model cells. We show that local velocity estimates may be derived from rotundal-cell responses via superposition in a subsequent processing step. Experimentally testable predictions which are both specific and characteristic to the model are provided. Thus, a conclusive explanation can be given of how the retino-tecto-rotundal pathway enables the animal to detect and localize moving objects or to estimate its self-motion parameters.
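The apparent tension above — global pooling versus motion analysis — can be made concrete with a minimal sketch. The neuron below sums simple correlation-type (Reichardt-like) subunits over an entire 1D image, so it has no spatial resolution at all, yet the sign of its pooled output still distinguishes the direction of motion. The detector form and pooling are illustrative, not the paper's firing-rate model.

```python
def widefield_response(frames, delay=1):
    """A wide-field motion neuron as a sum of local correlation-type
    (Reichardt-like) detectors pooled over the whole 1D image. Global
    pooling discards position, yet the sign of the summed output still
    distinguishes rightward (positive) from leftward (negative) motion.
    Illustrative sketch, not the paper's model."""
    resp = 0.0
    for t in range(delay, len(frames)):
        prev, cur = frames[t - delay], frames[t]
        for i in range(len(cur) - 1):
            # rightward-tuned minus leftward-tuned subunit
            resp += prev[i] * cur[i + 1] - prev[i + 1] * cur[i]
    return resp

def moving_bar(direction, length=12, steps=8):
    """Frames of a single bright pixel drifting one step per frame."""
    frames = []
    for t in range(steps):
        row = [0.0] * length
        row[2 + t if direction > 0 else length - 3 - t] = 1.0
        frames.append(row)
    return frames
```

Recovering *local* velocity from such globally pooled responses then requires a subsequent superposition step across a population of wide-field cells, which is the step the paper's model supplies.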
Mathematical properties of neuronal TD-rules and differential Hebbian learning: a comparison
A confusingly wide variety of temporally asymmetric learning rules exists related to reinforcement learning and/or to spike-timing-dependent plasticity, many of which look exceedingly similar while displaying strongly different behavior. These rules often find their use in control tasks, for example in robotics, and for this, rigorous convergence and numerical stability are required. The goal of this article is to review these rules and compare them, to provide a better overview of their different properties. Two main classes will be discussed: temporal difference (TD) rules and correlation-based (differential Hebbian) rules, as well as some transition cases. In general we will focus on neuronal implementations with changeable synaptic weights and a time-continuous representation of activity. In a machine-learning (non-neuronal) context, a solid mathematical theory of TD learning has existed for several years. This can partly be transferred to a neuronal framework, too. On the other hand, only now has a more complete theory also emerged for differential Hebbian rules. In general, rules differ by their convergence conditions and their numerical stability, which can lead to very undesirable behavior when applying them. For TD, convergence can be enforced with a certain output condition assuring that the δ-error drops on average to zero (output control). Correlation-based rules, on the other hand, converge when one input drops to zero (input control). Temporally asymmetric learning rules treat situations where incoming stimuli follow each other in time. Thus, it is necessary to remember the first stimulus in order to relate it to the later-occurring second one. To this end, different types of so-called eligibility traces are used by these two types of rules. This aspect again leads to different properties of TD and differential Hebbian learning, as discussed here.
Thus, this paper, while also presenting several novel mathematical results, is mainly meant to provide a road map through the different neuronally emulated temporally asymmetric learning rules and their behavior, to provide some guidance for possible applications.
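The structural contrast drawn above — δ-error gating (output control) versus derivative-of-output gating (input control), both riding on an eligibility trace of the earlier input — can be sketched side by side on one stimulus-pairing task. The discrete timings, trace decay, and rates below are illustrative assumptions, not the article's time-continuous formulation.

```python
def pairing_experiment(rule, trials=20, eta=0.1, gamma=0.9):
    """One stimulus-pairing task, two learning rules. An early input x1
    (t=5) predicts a later input x2 (t=10). Both rules use an eligibility
    trace of x1 but gate the weight change differently: TD by the
    delta-error (output control), differential Hebbian by the temporal
    derivative of the postsynaptic activity (input control).
    Timings, trace decay, and rates are illustrative assumptions."""
    w = 0.0
    for _ in range(trials):
        trace, pred_prev, out_prev = 0.0, 0.0, 0.0
        for t in range(20):
            x1 = 1.0 if t == 5 else 0.0
            x2 = 1.0 if t == 10 else 0.0         # later stimulus ("reward")
            trace = 0.8 * trace + x1             # eligibility trace of x1
            pred = w * x1                        # learned prediction
            out = pred + x2                      # postsynaptic activity
            if rule == "td":
                delta = x2 + gamma * pred - pred_prev
                w += eta * trace * delta         # gated by delta-error
            else:
                w += eta * trace * (out - out_prev)  # gated by d(out)/dt
            pred_prev, out_prev = pred, out
    return w
```

Both rules strengthen the predictive synapse under this pairing, but through different gating signals — which is why, as the abstract stresses, their convergence conditions (δ-error to zero versus input to zero) differ.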